Abstract

We present the first study to explore the use of out-of-turn interaction in websites. Out-of-turn
interaction is a technique that empowers the user to supply unsolicited information while browsing.
This approach helps flexibly bridge any mental mismatch between the user and the website, in a manner 
fundamentally different from faceted browsing and site-specific search tools. We built a user interface
(Extempore) that accepts out-of-turn input via voice or text, and deployed it on a US congressional
website to determine whether users utilize out-of-turn interaction for information-finding tasks, and
their rationale for doing so. The results indicate that users are adept at discerning when out-of-turn inter- 
action is necessary in a particular task, and actively interleaved it with browsing. However, users found 
cascading information across information-finding subtasks challenging. Therefore, this work not only 
improves our understanding of out-of-turn interaction, but also suggests further opportunities to enrich 
browsing experiences for users. 

Categories and Subject Descriptors: H.5.2 [User Interfaces]: Interaction Styles; H.5.4 [Hypertext/Hypermedia]:
Navigation.



Keywords: out-of-turn interaction, web interactions, user study, user interfaces, browsing, interactive 
information retrieval. 



1 Introduction 



It is now well accepted that flexible and contextual web browsing is imperative for customizing information 
access. Many solutions have been proposed (faceted browsing [?], personalized search [11], integrating
searching and browsing [9], and contextual presentation of results [6]), all of which aim to support the
user in achieving his or her information-seeking goals. The scope of such research entails the development
of new interaction techniques [4, ?], designing interfaces to support these techniques [2], and studying [15]
and modeling [3] the information-seeking strategies employed by users. Many of these projects have had
qualified success, and one would be tempted to surmise that all dimensions of research have been thor- 
oughly explored. In this paper, we identify an additional dimension of information access that suggests a 
novel technique for interacting with websites. 

1.1 Setting 

Consider a US Congressional website organized in a hierarchical manner, where the site requires the user 
to progressively make choices of politician attributes — state at the first level, branch at the second level, 
followed by levels for party, and district/seat — by browsing. Imagine how a user would pursue the following 
tasks: 

1. Find the webpage of the Democratic Representative from District 17 of Florida. 

2. Find the webpage of each Democratic Senator. 

The first task can be satisfied by typical drill-down browsing because it involves supplying in-turn,
or responsive, information at each level (see Fig. 1). By in-turn, we mean that the user need only click
on presented hyperlinks (click 'Florida' first, 'House' next, and so on). Each click communicates partial
information about the desired politician. Achieving the second task by communicating only in-turn information
would require a painful series of drill-downs and roll-ups, in order to identify the states that have at
least one Democratic Senator, and to aggregate the results. While the user has partial information about the 
desired politicians, s/he is unable to communicate it by in-turn means. 

The key observation here is that flexibility of information access will be enriched by increasing the 
means for supplying partial information. Ideally, the user, having seen that s/he does not have the partial 
information requested at the top level (i.e., state), would have liked to supply the information that s/he does 
have, namely that of party and branch of Congress. 

1.2 Solution Approach 

Out-of-turn interaction is our solution to support flexible communication of partial information not currently
requested by the system. Hence, such information is unsolicited but presumably relevant to the information-seeking
task. Out-of-turn interaction is thus unintrusive, optional, and can be introduced at multiple points
in a browsing session, at the user's discretion. One possible means to support it is to allow the speaking of 
utterances into the browser. 

Figure 2 describes using out-of-turn interaction to achieve Task 2 above. At the top level of the site, the
user is unable to make a choice of state, because s/he is looking for states that have Democratic Senators. 
S/he thus speaks 'Democrat' out-of-turn, causing some states to be pruned out (e.g., Alaska). At the second 
step, the site again solicits state information because this aspect has not yet been communicated by the user. 
The user speaks 'Senate' out-of-turn, causing further pruning (e.g., of American Samoa), and retaining only 
regions that have Democratic Senators. At this point, the goal has been achieved (the user notices 31 states 
satisfying the criteria), and s/he proceeds to browse through the remaining hyperlinks. Notice that these are
contextually relevant to the partial information supplied thus far, so that when 'Georgia' is clicked, there is
only one choice of seat (Senior), implying that the other Senatorial seat is not occupied by a Democrat.

Figure 1: In-turn interaction with a Congressional site.
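
The pruning behavior in this session can be illustrated with a minimal Python sketch. It assumes the site is held as a flat list of politician records whose levels are solicited in a fixed order; the toy records, the field names, and the `prune` and `choices` helpers are hypothetical illustrations and are not taken from the Extempore implementation.

```python
# Hypothetical toy data: each record stands for one leaf page of the site.
LEVELS = ["state", "branch", "party", "seat"]

SITE = [
    {"state": "Georgia", "branch": "Senate", "party": "Democrat", "seat": "Senior"},
    {"state": "Georgia", "branch": "Senate", "party": "Republican", "seat": "Junior"},
    {"state": "Georgia", "branch": "House", "party": "Democrat", "seat": "District 2"},
    {"state": "Alaska", "branch": "Senate", "party": "Republican", "seat": "Senior"},
    {"state": "Alaska", "branch": "House", "party": "Republican", "seat": "At-large"},
]


def prune(records, level, value):
    """Keep only the records consistent with one piece of partial information."""
    return [r for r in records if r[level] == value]


def choices(records, supplied):
    """Return the first level not yet supplied and the hyperlinks it would show."""
    for level in LEVELS:
        if level not in supplied:
            return level, sorted({r[level] for r in records})
    return None, []


# Speak 'Democrat' and then 'Senate' out-of-turn while the site solicits state.
remaining = prune(prune(SITE, "party", "Democrat"), "branch", "Senate")
print(choices(remaining, {"party", "branch"}))  # ('state', ['Georgia'])
```

In this toy site, Alaska disappears from the state choices after the two out-of-turn inputs, while the site continues to solicit state, the first level that has not yet been supplied.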

2 Out-of-turn Interaction 

What does it mean to interact out-of-turn? One interpretation is that, when the user speaks 'Democrat', 
s/he desires to experience an interaction sequence through the site containing 'Democrat.' The implicit
assumption in the current implementation is that what is spoken is a link label (or variation thereof) nested
deeper in the site, and hence an in-vocabulary utterance^1. Therefore, out-of-turn interaction is merely a
mechanism to address alternate aspects of the given activity, while postponing the specification of currently 
solicited aspects. 

Why would users interact out-of-turn? There are several reasons. First, what the site is requesting from 
the user may actually be what the user is seeking in the first place! For example, in Figure 2 the site is soliciting
state but the user is looking for states with a certain property. Second, being able to speak out-of-turn
in an otherwise hardwired site permits the realization of interaction sequences not describable by browsing.
This means that we can support all permutations of specifying politician attributes, without explicitly
enumerating in-turn choices. The above example would require 4! = 24 faceted browsing classifications to
support all tasks (i.e., browse by party-state-branch-district, by state-party-branch-district, and so on). Third,
the incorporation of out-of-turn information does not curb the interaction (i.e., the levelwise organization is 
preserved), but rather situates future interactions in the context of past ones. More fundamentally, out-of- 
turn interaction is a novel way to flexibly bridge any mental mismatch between the user and the website, 
without anticipating when the mismatch might happen. 
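
The count of 24 classifications can be checked by enumerating the orderings of the four politician attributes. The short Python sketch below does this; the attribute names are simply the ones used in the running example.

```python
from itertools import permutations

ATTRIBUTES = ["state", "branch", "party", "district/seat"]

# Every ordering of the four attributes is one faceted classification
# that a site would have to enumerate explicitly.
orderings = list(permutations(ATTRIBUTES))

print(len(orderings))          # 24
print("-".join(orderings[0]))  # state-branch-party-district/seat
```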

2.1 Related Research 

To better understand the merits of out-of-turn interaction, we showcase related research in a three-dimensional
space (see Figure 3) involving: (i) the nature of information exploited, (ii) the level of context supported,
and (iii) the interaction technique.

^1 In other implementations, we might conduct a more elaborate modeling of the vocabulary.

Figure 2: A web session illustrating the use of out-of-turn interaction in a US congressional site. This
progression of interactions shows how the (Democrat, Senate, Georgia, Senior) interaction sequence, which
is indescribable by browsing, may be realized. In steps 1 and 2, 'Democrat' and 'Senate', respectively, are spoken
out-of-turn when the system solicits state. In step 3, the user clicks 'Georgia' as the state (an in-turn
input). The screen at step 4 shows that only the Senior Senator from Georgia is a Democrat, and leads the
user to his homepage.

Figure 3: Three-dimensional space showcasing related research. Each of the shaded clusters denotes a
concerted group of projects discussed in the main text.

The first axis distinguishes between the specification of partial vs. complete information. Supporting 
only the specification of complete information means that interaction is viewed as a one-shot activity; sup- 
porting specification of partial information implies that information-seeking is to be conducted over multiple 
steps. Since the complete information approach is more restrictive than the partial information approach, 
it is situated toward the origin. The second axis makes a distinction of whether input or results (or both) 
are contextually qualified in some manner. Our contribution to this space is the third dimension of whether 
interaction occurs by in-turn or out-of-turn means. 

Search engines (e.g., Google) are characterized by specification of complete information (in this case, 
the query), because the interaction is terminated by returning a flat list of results. Such a low-context, 
complete information approach is denoted by the origin in Fig. 3. Browsing, on the other hand, involves the
incremental specification of partial information (right of origin in Fig. 3).

When we take context into account, two further clusters of projects emerge in the in-turn plane spanning
the (information x context) axes. When only complete information is supported, results presentation provides
the major opportunity for exhibiting context (front left corner of Figure 3). This is seen in site-specific
search tools (e.g., at Amazon.com), in the contextual search of Dumais et al. [6], and the personalized
search strategies of Pitkow et al. [11]. The denser cluster (front right of Figure 3) forms in the partial
information region. These are projects that support contextual information access by providing either greater
input flexibility or adaptable display of results over the course of an interaction, or both. Faceted (flat or
hierarchical) organizations [?], Dynamic Taxonomies [14], Strategy Hubs [4], adaptive hypermedia [5],
and ScentTrails [9] are examples. We discuss these further.

Sites and systems exposing faceted browsing (e.g., epicurious.com) support multiple classifications by 
providing enumerated in-turn choices. This often leads to cumbersome site designs and a mushrooming 
of possible choices at each step. The Dynamic Taxonomies project provides in-turn operators for pruning 
information hierarchies, while Strategy Hubs enumerates templates for prolonged and detailed information-seeking
tasks, again in-turn. The adaptive hypermedia projects employ user models (e.g., of past browsing
behavior) to tailor the presentation of hyperlinks. ScentTrails argues that browsing may not be focused
enough and that searching loses context, and aims to combine them in a single framework. However, its use
of searching always precedes browsing and therefore limits the richness of supportable interactions. Out-of-turn
interaction aims to provide precisely this combination of focused input and exploratory browsing in a
single, flexible framework.

Figure 4: The Extempore out-of-turn interaction toolbar. This interface is embedded into a traditional web
browser to augment hyperlink interaction. Here the user has supplied 'Democrat', presumably out-of-turn.

Our work can be viewed as complementary to these efforts in that it lifts the nature of interaction from 
in-turn to out-of-turn means (top of Figure 3). For instance, the example session shown in Figure 2 can
be viewed as a lifted version of traditional browsing, yielding a (high context, out-of-turn) technique that 
exploits partial information. While out-of-turn interaction can be studied in many settings, this paper only 
discusses its use in conjunction with browsing of levelwise, hierarchically organized sites. 

Out-of-turn interaction, especially of the unsolicited reporting nature, has been recognized as a simple
form of mixed-initiative interaction [?]. Interleaving out-of-turn responses with in-turn clicks can be viewed
as conversational shifts of initiative between the user and the website. 

2.2 Extempore 

We have built a user interface, called Extempore, that accepts out-of-turn input either via voice or text. The
voice version was implemented using SALT 1.1 ([?]; a standard that augments HTML with tags for speech
input/output) and SRGS (Speech Recognition Grammar Specification), for use with Internet Explorer 6.0.
The text version is a toolbar embedded into the Mozilla/Netscape web browser (v1.4) and was implemented
using XUL (see Figure 4)^2. It is important to note that Extempore is embedded in the web browser, and
not the site's webpages. It is also not a site-specific search tool that returns a flat list of results (akin to the
Google toolbar). Further, while search engines index webpages, Extempore instead relies on an internal
representation of the website and, when out-of-turn input is supplied, uses transformation techniques to stage
the interaction, pruning the website accordingly. The details of the underlying software transformations
are beyond the scope of this work; see, for example, Ricca and Tonella [12] and Perugini and Ramakrishnan [10]
for ideas on transformation techniques. Extempore can be used for out-of-turn interaction in many
web sites, given a representation of the site's structure, e.g., in XML.
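
To make the idea of an internal site representation concrete, the following Python sketch parses a small XML description of a hierarchical site and prunes it by one out-of-turn label. The XML schema (nested `node` elements with `label` and `href` attributes), the file names, and the helper functions are all assumptions made for illustration; the actual representation and transformation techniques used by Extempore are not described here.

```python
import xml.etree.ElementTree as ET

# Hypothetical site description; the element and attribute names are assumed.
SITE_XML = """
<site>
  <node label="Georgia">
    <node label="Senate">
      <node label="Democrat"><node label="Senior" href="ga-senior.html"/></node>
      <node label="Republican"><node label="Junior" href="ga-junior.html"/></node>
    </node>
  </node>
  <node label="Alaska">
    <node label="Senate">
      <node label="Republican"><node label="Senior" href="ak-senior.html"/></node>
    </node>
  </node>
</site>
""".strip()


def leaf_paths(element, prefix=()):
    """Yield (labels along the path, href) for every leaf page under element."""
    for child in element.findall("node"):
        path = prefix + (child.get("label"),)
        if child.get("href") is not None:
            yield path, child.get("href")
        else:
            yield from leaf_paths(child, path)


def prune_by_label(root, label):
    """Keep only the leaf paths that contain the given out-of-turn label."""
    return [(path, href) for path, href in leaf_paths(root) if label in path]


root = ET.fromstring(SITE_XML)
for path, href in prune_by_label(root, "Democrat"):
    print(" > ".join(path), "->", href)
```

Matching a label anywhere along a path mirrors the assumption that an utterance is a link label nested deeper in the site.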

3 Exploratory Study 

Extempore is fundamentally different from existing approaches to customize browsing experiences; there- 
fore we conducted a study that exposes users to out-of-turn interaction, to determine if they utilize it in 
information-finding tasks, and their rationale for doing so. The main component of the study entailed asking
participants to perform eight specific information-finding tasks in the Project Vote Smart (PVS) website
(http://www.vote-smart.org)^3. Rationale was gathered through think-aloud and retrospective protocols.

^2 Currently there is no SALT plugin for Mozilla (and likewise with XUL and IE). Due to these technological constraints, we do
not support both interfaces of Extempore in the same implementation.

^3 At the time this study was conducted, PVS employed a hardwired organization akin to Figure 1; the site has been recently
restructured into a flat faceted classification.

Figure 5: Minimum number of interactions (log10 scale) required to successfully satisfy each information-finding
task using in-turn (dark) and out-of-turn (light) interaction. Note that Task F can be completed with
just one out-of-turn interaction, so its entry in the graph shows zero.



3.1 Goals 

The goal of the experiment was to study usage patterns for out-of-turn interaction, not to evaluate the inter- 
faces used to realize it, or to compare out-of-turn interaction with other interaction techniques. 

3.2 Participants 

We analyzed data from 24 participants; all were students with an average age of 21, and
a majority were undergraduates in computer science. Some of the participants were recruited from an HCI
course, and were compensated with extra credit from the instructor. Since a component of this experiment
involved voice recognition software, we primarily recruited native speakers of English. Average participant 
computer and web familiarity and use was 4.75 or greater on a 5-point Likert scale. Average participant 
familiarity with voice recognition software was 1.46, and mean familiarity with the structure of the US 
Congress was 2.83; no user had visited the PVS website prior to the experiment. 

3.3 Tasks 

The eight tasks were carefully formulated to generate a diverse set of interaction choices: 

A. Find the webpage of the Junior Senator from New York. 

B. Find the webpage of the Democratic Representative from District 17 of Florida. 

C. Find the webpage of the Republican Junior Senator from Oregon. 

D. Find the webpage of the Democratic member of the House in Rhode Island serving district 2. 

E. Find the states which have at least one Democratic Senator. 

F. Find the states which have twenty or more congressional districts. 

G. Find the states which have at least one Republican member of the House. 

H. Find the political party of the Senior Senator representing the only state which has congresspeople from
the Independent party.

We refer to tasks A, B, C, and D as non-oriented tasks, in that they can be performed as easily by employing
solely in-turn interaction (i.e., in this case, hyperlinks), solely out-of-turn interaction (Extempore),
or using a mixture of both. Out-of-turn interaction does not appear to be worthwhile with respect to these
tasks because the effort required to perform them with out-of-turn interaction is commensurate with that
of in-turn interaction. Tasks E, F, G, and H are out-of-turn-oriented, because they are difficult to perform
with only in-turn interaction. Formally, we say an information-seeking task is out-of-turn-oriented if the 
minimum number of browsing interactions required to successfully complete it exceeds the maximum depth 
of the targeted website; otherwise it is non-oriented. 
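
The formal definition reduces to a single comparison, sketched below in Python. The function name and the interaction counts passed to it are hypothetical; only the site depth of four comes from the study. The last line shows why a task that needs a single interaction has an entry of zero on a log10 scale.

```python
import math

MAX_DEPTH = 4  # maximum depth of the PVS site used in this study


def is_out_of_turn_oriented(min_browsing_interactions, max_depth=MAX_DEPTH):
    """A task is out-of-turn-oriented if browsing alone needs more steps than the site is deep."""
    return min_browsing_interactions > max_depth


# Hypothetical minimum browsing counts, for illustration only.
print(is_out_of_turn_oriented(4))    # False: a single drill-down suffices
print(is_out_of_turn_oriented(200))  # True: repeated drill-downs and roll-ups

# One interaction plotted on a log10 scale gives a value of zero.
print(math.log10(1))  # 0.0
```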

The maximum depth of the PVS site is four, and Figure 5 illustrates the minimum number of interactions
required per task. In calculating this minimum number, we assumed that the user can supply at most one
aspect at each step (in-turn or out-of-turn), and discounted back button clicks (which occur when employing only
in-turn interaction for an out-of-turn-oriented task). Notice also that some tasks, namely the non-oriented
ones, cannot be performed purely by a sequence of out-of-turn interactions; a terminal in-turn input is often
necessary, and these are discounted as well. For instance, try solving task A using purely out-of-turn inputs.

3.4 Design 

The study was designed as a within-subject experiment. Task was the independent variable and the interaction
observed (in-turn vs. out-of-turn) was the dependent variable. Participants were given both the toolbar
and voice interfaces of Extempore, and performed four tasks with each (two non-oriented and two
out-of-turn-oriented). We designed the experiment with the provision for interfaces in two different modalities,
to more naturally assess the use of out-of-turn interaction independent of a particular interface for it. Each
participant performed the eight tasks in an order predetermined by a Latin square to control for unmeasured
factors. In addition, the specific interface to be used (toolbar or voice) for a (task, participant) pair was 
determined a priori by complete counterbalancing within each task category. Thus, for each task, half of the 
twenty-four participants were given the toolbar interface and half the voice interface. The participants were 
free to utilize any strategy to complete the information-finding tasks, given Extempore and the available 
hyperlinks; they were given unlimited time to complete each task. 
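
One way such a design could be generated is sketched below in Python. This is only an illustration under stated assumptions: the cyclic construction of the Latin square, the mapping of participants to rows, and the indexing scheme in `toolbar_tasks` are hypothetical choices, since the exact construction used in the study is not specified. The sketch does reproduce the stated properties: each participant uses each interface for two tasks per category, and each task is done with the toolbar by 12 of the 24 participants.

```python
from itertools import combinations

TASKS = list("ABCDEFGH")
NON_ORIENTED, ORIENTED = TASKS[:4], TASKS[4:]
N_PARTICIPANTS = 24


def latin_square(items):
    """Cyclic Latin square: row i is the item list rotated left by i positions."""
    n = len(items)
    return [[items[(i + j) % n] for j in range(n)] for i in range(n)]


def toolbar_tasks(participant):
    """Tasks this participant performs with the toolbar; the other four use voice."""
    pairs_non = list(combinations(NON_ORIENTED, 2))  # 6 ways to pick 2 of 4 tasks
    pairs_ori = list(combinations(ORIENTED, 2))
    # Cycle through the 6 pairs at two different rates so that each pair
    # is used by exactly 4 of the 24 participants in each category.
    return set(pairs_non[participant % 6]) | set(pairs_ori[(participant // 4) % 6])


square = latin_square(TASKS)
for p in range(3):
    order = square[p % len(TASKS)]  # assumed mapping of participant to row
    print(p, "".join(order), sorted(toolbar_tasks(p)))
```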

3.5 Configuring Extempore 

A vocabulary for the PVS site was created by collecting all link labels, synonyms (e.g., 'Representatives' 
for 'House'), and alternate forms of common utterances (e.g., 'Senate', 'Senator', 'Senators'). Both the 
toolbar and voice versions of Extempore supported this vocabulary, with the toolbar additionally supporting
abbreviations (e.g., CA for California). To keep users abreast of partial information supplied thus
far (either by browsing or via Extempore), we continually updated an 'Input so far:' label in the browser
status bar (see Figure 2). We also included a provision for the user to inquire about what partial information is
left unspecified at any step. Access to this feature is provided through a 'What May I Say?' button (labeled
with a '?' in Figure 4) or utterance.

The semantics of out-of-turn interaction in information hierarchies required some practical implementation
decisions. For instance, when the user speaks 'Junior seat,' the specification of 'Senate' can be
automatically inferred by functional dependency. Another form of such 'utterance expansion' occurs in
response to single-valued options. For instance, in Figure 2 one can argue that the choice of seat at the last
step is really unnecessary, as there is only one option left (Senior). When only one path remains among
the available options, we vertically collapse them and directly present the leaf page. This feature was not
illustrated in Figure 2 for ease of presentation, but we implemented it in our study. Notice, however, that
no information is lost during such collapsing, since terminal pages in PVS identify all pertinent attributes of 
politicians. 
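
Both forms of utterance expansion can be sketched in a few lines of Python. The dependency table, the record layout, and the `expand` and `collapse` helpers below are hypothetical and serve only to illustrate the two decisions described above; they are not the Extempore implementation.

```python
LEVELS = ["state", "branch", "party", "seat"]

# Hypothetical functional dependencies: a Senate seat term implies the branch.
IMPLIES = {
    "Junior": {"branch": "Senate"},
    "Senior": {"branch": "Senate"},
}


def expand(level, value):
    """Return the supplied aspect together with any aspects it determines."""
    aspects = {level: value}
    aspects.update(IMPLIES.get(value, {}))
    return aspects


def collapse(records, supplied):
    """Fill in every unspecified level that has a single remaining value."""
    supplied = dict(supplied)
    for level in LEVELS:
        values = {r[level] for r in records}
        if level not in supplied and len(values) == 1:
            supplied[level] = values.pop()
    return supplied


# Toy record: the only page left after 'Democrat', 'Senate', and 'Georgia'.
records = [
    {"state": "Georgia", "branch": "Senate", "party": "Democrat", "seat": "Senior"},
]

print(expand("seat", "Junior"))
print(collapse(records, {"party": "Democrat", "branch": "Senate", "state": "Georgia"}))
```

Once `collapse` has filled every level, a single leaf page remains and can be presented directly.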
3.6 Equipment, Training, and Procedures

3.6.1 Equipment 

Participants performed the tasks on an Extempore-enabled Pentium III workstation, connected to a 17" 
monitor set at 1280x1024 resolution in 32-bit true color, running Windows 2000. We recorded a video
of each participant performing the information-finding tasks using the Camtasia screen and audio capture 
software. The resulting capture was used to aid participant recollection during the retrospective verbal 
protocol as well as in subsequent analysis (e.g., think-aloud). The Audacity audio recording application was 
used during the retrospective portion of the experiment to capture participant explanations. Data from the 
pre-questionnaire (demographics, computer familiarity) and post-questionnaires (rationale) were recorded
on paper. Finally, at the end of the entire experiment, we transcribed and collated the data gathered from all
sources to construct a complete record of each participant session, including interaction sequences followed 
per task. Each participant session lasted approximately 90 minutes. 

3.6.2 Training 

Prior to revealing the information-seeking tasks, we gave users specific training on (i) the PVS website, 
including levels of classification, and interacting with it via hyperlinks; (ii) interacting with PVS using 
Extempore (both toolbar and voice interfaces); and (iii) interleaving hyperlink clicks with submissions via
Extempore. Users were provided a card summarizing the vocabulary that Extempore can understand, as 
well as explanations of political terms and their functional dependencies. This card was available for their 
use during the entire session, not just training. We did not use terms such as 'in-turn' or 'out-of-turn' during
training or elsewhere in the study. This is to prevent biasing of participants toward any intended benefits of 
Extempore, and also to help them conceptualize its functionality on their own. In other words, we simply 
trained users on how to employ the available interfaces (hyperlink and Extempore) for information seeking. 
After some self-directed exploration, users were given a short test consisting of four practice tasks (two with 
toolbar and two with voice). 

3.6.3 Procedures 

After the users completed the training tasks, we administered the actual test involving tasks A-H above, and 
employed both concurrent (think-aloud) and retrospective protocols to elucidate rationale. A structured
interview, including a post-questionnaire, was conducted to gather additional feedback. The entire experiment
generated (24x8 =) 192 (participant, task) interaction sequences. 

These sequences were then analyzed for frequencies of usage of in-turn vs. out-of-turn interaction.
For purposes of this study, we defined an in-turn interaction as a hyperlink click or the communication of 
in-turn partial information to the website via Extempore. Notice that just saying 'Connecticut' will not 
qualify as an out-of-turn interaction, if the same choice was currently available as a hyperlink. Similarly, we 
defined an out-of-turn interaction to be the submission of one aspect of unsolicited partial information to the 
site. Supplying more than one aspect of partial information to the site out-of-turn (e.g., saying 'Democratic
Senators') corresponds to multiple out-of-turn interactions. 

Notice that a user may supply in-turn and out-of-turn information to the website simultaneously via 
Extempore. For instance, in the top-level page in Figure |2l the user might say 'House, Florida, District 
17, Democrat,' all at the outset. Observe that a permutation of this utterance exists — 'Florida, House, 
Democrat, District 17' — that, if conducted incrementally, could imply a purely in-turn interaction. Such an 
interaction is thus viewed as having four in-turn inputs. On the other hand, consider a user who says 'New 
York, Democrat' at the outset. There is no permutation with respect to the PVS site that permits viewing 
this utterance as consisting purely of in-turn input, and hence, it is classified as one in-turn input ('New
York'), followed by an out-of-turn input ('Democrat'). This policy of counting does not favor (and actually
works against) out-of-turn interaction.
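
The counting policy can be approximated by a greedy procedure, sketched below in Python. This is one interpretation rather than the exact procedure used in the analysis: the level order, the `count_inputs` function, and the representation of an utterance as a list of level names are assumptions. The sketch repeatedly checks whether the utterance contains the aspect the site would solicit next; aspects consumed this way count as in-turn, and whatever is left counts as out-of-turn.

```python
LEVELS = ["state", "branch", "party", "district"]  # assumed order of solicitation


def count_inputs(aspects, already_supplied=()):
    """Split one utterance into (in_turn, out_of_turn) counts.

    aspects: the levels named in the utterance, in any order,
             e.g. ["branch", "state"] for 'House, Florida'.
    """
    supplied = set(already_supplied)
    pending = set(aspects)
    in_turn = 0
    while pending:
        # The aspect the site would solicit next, given what is known so far.
        solicited = next((lv for lv in LEVELS if lv not in supplied), None)
        if solicited not in pending:
            break
        pending.remove(solicited)
        supplied.add(solicited)
        in_turn += 1
    return in_turn, len(pending)


# 'House, Florida, District 17, Democrat': some permutation is purely in-turn.
print(count_inputs(["branch", "state", "district", "party"]))  # (4, 0)
# 'New York, Democrat': state is in-turn, party is out-of-turn.
print(count_inputs(["state", "party"]))  # (1, 1)
```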

Some users, after completing a given task via out-of-turn interaction, verified part of their results via 
in-turn interactions. This was confirmed through their retrospective feedback, and such in-turn interactions 
were discounted in the analysis. 

4 Results 

Of the 192 recorded interaction sequences, 177 of them involved the successful completion of the task by 
the participant. We analyze these 177 sequences first, followed by the remaining 15 sequences (which were 
all generated in response to out-of-turn-oriented tasks).

4.1 General Usage Patterns 

Results indicate a high frequency of usage for out-of-turn interaction. 94.4% of the 177 sequences contained 
at least one out-of-turn interaction. In addition, every participant used out-of-turn interaction for at least 
70% of the tasks, with 16 people using it in all tasks. Conversely, every task was performed with out-of-turn
interaction by at least 80% of the participants, with 4 tasks enjoying out-of-turn interaction by all
participants. These results are encouraging because Extempore usage is optional and not prompted by any 
indicator on a webpage. Participants successfully completed the given tasks irrespective of the presented 
interface (voice or toolbar). 

4.2 Classifying Interaction Sequences 

The 177 interaction sequences were classified into five categories denoted by: (i) I, (ii) O, (iii) IO, (iv) OI,
and (v) M. The I and O categories denote sequences composed purely of in-turn or out-of-turn inputs,
respectively. In IO sequences all in-turn inputs precede out-of-turn inputs (analogously, for OI). M sequences
('mixed') are those which do not fall in the above categories. For instance, the interaction shown in
Fig. 1 would be classified under I, and that in Fig. 2 under OI. We posit that this classification provides insight
into users' information-seeking strategies, and can be related to the nature of the information-finding task. 
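
The five-way classification can be stated compactly in code. The Python sketch below assumes each interaction sequence is encoded as a string with one letter per input, 'I' for in-turn and 'O' for out-of-turn; this encoding and the `classify` function are illustrative assumptions. Consecutive inputs of the same kind are collapsed into runs, and the resulting pattern determines the class.

```python
def classify(sequence):
    """Classify a sequence such as 'OOII' into one of I, O, IO, OI, or M."""
    if not sequence or set(sequence) - {"I", "O"}:
        raise ValueError("sequence must be a non-empty string of 'I' and 'O'")
    # Collapse consecutive repeats, e.g. 'OOII' becomes 'OI'.
    runs = []
    for symbol in sequence:
        if not runs or runs[-1] != symbol:
            runs.append(symbol)
    pattern = "".join(runs)
    return pattern if pattern in ("I", "O", "IO", "OI") else "M"


print(classify("IIII"))  # I  (pure browsing)
print(classify("OOII"))  # OI (out-of-turn inputs followed by in-turn inputs)
print(classify("OIO"))   # M  (mixed)
```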

Figure 6 shows the distribution of the 177 sequences into the five classes, and Table 1 depicts a breakdown
by both task orientation and classes. Notice that the O, OI, IO, and mixed classes have been grouped in
Table 1 to distinguish them from pure browsing interactions (I).

As Figure 6 shows, 10 of the 177 sequences fall in the I class, i.e., these are browsing sequences. As
Table 1 (lower left) shows, all of the 10 browsing sequences were generated in response to non-oriented
tasks, revealing that 100% (81/81) of the sequences for out-of-turn-oriented tasks involved out-of-turn
interaction. Therefore, 

• users never attempted to achieve an out-of-turn oriented task via browsing; or in other words, 

• users always employed out-of-turn interaction when presented with an out-of-turn-oriented task. 

This is notable because it confirms that users are adept at discerning when out-of-turn interaction 
is necessary. 

4.3 Detailed Analysis of Interaction Classes 

Let us now study the interactions in classes O, OI, IO, and M. The 69 pure out-of-turn sequences (O)
were observed only in out-of-turn-oriented tasks E, F, and G, and were used by all 24 participants. This
clustering of the O sequences around three tasks shows that, whenever participants completed these tasks,
they did so in the shortest manner possible. Refer again to Figure 5 for the sharp contrast between the length of
the minimum out-of-turn sequence and that of the minimum in-turn sequence for these tasks.

Table 1: Breakdown of 177 interaction sequences in various categories. The total number of interaction
sequences for out-of-turn-oriented tasks is 15 fewer than that for non-oriented tasks; these were the sequences
where the participant did not complete the task successfully.

Classes IO, OI, and M contain the sequences exhibiting rich interaction strategies. Classes IO and OI
were observed in near-equal numbers, and primarily in the non-oriented tasks (A, B, C, and D) with the
exception of OI, which was also seen in task H. No particular clustering was observed with respect to
participants. The 17 class M interactions exhibited only two types of patterns: 14 with an OIO form, and
3 with an IOI form. Furthermore, like OI, these 17 mixed interactions also involved only the non-oriented
tasks (A, B, C, D) and task H. It is interesting that we observe OIO and IOI sequences even in a site
with only four levels. Once again, no specific clustering was observed around participants.

To see if these classes correspond to specific information-seeking strategies, we plotted curves depicting 
the progressive narrowing down to a desired congressional official, as a function of interaction steps. All
curves begin at the (0, 540) point because the PVS site indexes all 540 congressional officials. With each 
interaction, this number is gradually reduced until the user arrives at the desired set of officials. However, 
we were unable to observe major correlations between curve slopes and strategies; this is because in the 
PVS site, the slope is primarily dependent on the nature of the task, not the strategy. For instance if a task 
involved a state like 'Rhode Island,' even an in-turn input of this state information will cause greater pruning 
than most out-of-turn inputs. To qualify interaction classes better, we must study out-of-turn interaction in
more sites. 

4.4 Cascading Information across Subtasks 

Recall that 15 interaction sequences led to incorrect answers; interestingly, 12 of these 15 were generated
in response to Task H. Notice that Task H is challenging, because it involves two subtasks and cascading
information found in one into the other. The user is expected to first find the only state having Independent
congressional officials (Vermont), and then find the political party of the Senior Senator from that
state (Democrat). In other words, this task requires procedural, not just declarative, knowledge (a distinction
motivated in the Strategy Hubs project [4]).

Figure 7: Task H: the user is expected to first find 'Vermont' in one interaction (third window from left) and
use it as input in another interaction (shaded area) to find the party of the Senior Senator from that state. The
two windows on the right depict unnecessary and irrelevant interactions for this task.

Most people were adept at finding that Vermont was the desired state (e.g., by saying 'Independent' at 
the outset), but did not realize that the task cannot be completed by continuing that interaction. As Figure 7
shows, clicking on the only available state link ('Vermont') now presents a choice of House vs. Senate. 
Clicking on Senate takes the user to the webpage of Jim Jeffords, who is the Junior Senator from Vermont, 
not the Senior Senator! 

Some users immediately realized the problem, as identified in their retrospective interviews, e.g.: 

"This question was tricky. Cause it was, I was like wait, if he's Independent then his party is 
Independent ... at first [I thought] it was the Senior Senator who was Independent . . . and I got 
this guy's webpage, and then I saw that he was the Junior ... So then I eventually went back to 
Vermont and got the [Senior] guy." 

Only 12 (50%) of the participants successfully completed this task. This result demonstrates that cascading 
information across subtasks is challenging. It was clear that all users wanted to continue the interaction, 
but some failed to realize that out-of-turn interaction as presented here is merely a pruning operator, and 
not constructive. Investigating the incorporation of constructive operators such as rollup/expansion is thus a 
worthwhile direction of future research. 
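
The difficulty with Task H can be made concrete with a minimal sketch of out-of-turn input as a pure pruning operator. The index below is a hypothetical toy (only Jim Jeffords is named in the text; the other records are placeholders), and prune and TOY_INDEX are illustrative names, not part of Extempore.

```python
from typing import Dict, List

Official = Dict[str, str]

# Hypothetical toy index; placeholder records except for Jim Jeffords.
TOY_INDEX: List[Official] = [
    {"name": "Jim Jeffords", "state": "Vermont", "party": "Independent",
     "chamber": "Senate", "seniority": "Junior"},
    {"name": "Senior Senator (VT)", "state": "Vermont", "party": "Democrat",
     "chamber": "Senate", "seniority": "Senior"},
    {"name": "Senior Senator (TX)", "state": "Texas", "party": "Republican",
     "chamber": "Senate", "seniority": "Senior"},
    {"name": "Representative (TX)", "state": "Texas", "party": "Democrat",
     "chamber": "House"},
]


def prune(officials: List[Official], **facets: str) -> List[Official]:
    """Keep only the officials matching every supplied facet value."""
    return [o for o in officials
            if all(o.get(key) == value for key, value in facets.items())]


# Subtask 1: saying 'Independent' at the outset isolates the state.
independents = prune(TOY_INDEX, party="Independent")
states = {o["state"] for o in independents}

# Continuing the same interaction can only shrink this set, so the
# Senior Senator (a Democrat) is no longer reachable from it.
stuck = prune(independents, chamber="Senate", seniority="Senior")

# Subtask 2: start a new interaction, carrying over only the state.
(state,) = states
answer = prune(TOY_INDEX, state=state, chamber="Senate", seniority="Senior")
party = answer[0]["party"]

print(states, stuck, party)  # {'Vermont'} [] Democrat
```

Because prune never adds records back, the second subtask has to restart from the full index with the state found in the first. A constructive operator such as rollup/expansion would instead widen the pruned set from the Independent officials to all officials of their state.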

4.5 Rationale and Qualitative Observations 

Studying users' rationale revealed their reasons for interacting out-of-turn:

"I can jump through all the levels " 

"Initially I thought I would prefer the hyperlinks . . . after reading the questions, it became ap- 
parent that the toolbar and voice interface would simplify the task." 

". . . when you wanted to know all the states for the Republicans, then you would have to click 
on every single link. It would just get annoying after a while. You'd just give up I think. There'd
be no way." 

"I guess I would have had to . . . wow, check every state." 

demonstrated understanding of how Extempore works (e.g., input expansion):

"Its the easiest way cause there is only one Representative from District 17 in Florida and it
takes you straight to the page." 

"If you click on the state then you get choices of House and whatever, but if you type in district 
2 and it just goes right there." 

presented advantages and judgments: 

". . . allowed multiple pieces of information to be input at one time." 

"As much surfing as I do, it sort of makes me wish I had those options sometimes ya know 
instead of going to search engines and fooling around . . . having to come up with different
search criteria " 

and also brought out frustrations: 

"The voice interface feels a little awkward since I am not used to talking to myself " 

"I don't always trust the results, [so I went back] confirming that they are all republican." 

Many users learned that out-of-turn interaction is best suited when they have a specific goal in mind, 
and not meant for exploratory information-seeking (as is browsing). For instance, 

"if I wanted to go the whole way down to a specific person, I would use [Extempore], but if I 
was just looking around, I would use the links."

"[Extempore] is good when you know the site and know you have to go several layers deep. 
Links [are good] when you don't know the layout or don't know exactly what you want." 

5 Discussion 

Extempore enables a novel approach to interacting with websites. Users with out-of-turn partial input can
employ Extempore to enhance their browsing experiences. Thus, out-of-turn interaction is intended to
complement browsing, and not replace it. For designers, Extempore augments their sites with capabilities
for personalized interaction, without hardwiring in-turn mechanisms (as is commonly done). In addition, 
since usage of Extempore is optional, it preserves any existing modes of information-seeking. 

There are significant lessons brought out by our study, which we only briefly mention here. This work 
validates our view of web interaction as a flexible dialog and shows that users actively interleaved out-of-turn 
interaction with browsing. Importantly, users were proficient at determining when out-of-turn interaction 
is called for. Studying the rationale and usage patterns has generated a body of knowledge that can be 
used, among other purposes, for introducing out-of-turn interaction in new settings and to new participants. 
Furthermore, we have seen that it is easy to target out-of-turn interaction in domains where tasks involve 
combinations of focused and exploratory behavior. Recall also that dialogs with purely declarative specifi- 
cations are readily supported; others such as Task H will require further study. 

Out-of-turn interaction is most effective when users have a basic understanding of the application domain
and know what aspects are addressable. When users do not know what to say fT5\, our facility to
enquire about legal utterances may induce information overload in large sites. While we have not en- 
countered this problem in our PVS study, we suspect that applying out-of-turn interaction in large web 
directories (e.g., ODP) will involve new research directions. 